On the Computational Economics of Reinforcement Learning
نویسندگان
چکیده
Following terminology used in adaptive control , we distinguish between indirect learning methods, which learn explicit models of the dynamic structure of the system to be controlled , and direct learning methods, which do not. We compare an existing indirect method, which uses a conventional dynamic programming algorithm, with a closely related direct reinforcement learning method by applying both methods to an innnite horizon Markov decision problem with unknown state-transition probabilities. The simulations show that although the direct method requires much less space and dramatically less computation per control action, its learning ability in this task is superior to, or compares favorably with, that of the more complex indirect method. Although these results do not address how the methods' performances compare as problems become more diicult, they suggest that given a xed amount of computational power available per control action, it may be better to use a direct reinforcement learning method augmented with indirect techniques than to devote all available resources to a computation-ally costly indirect method. Comprehensive answers to the questions raised by this study depend on many factors making up the economic context of the computation.
منابع مشابه
Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach
Wireless Sensor Networks (WSNs) are consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are number of challenges in WSNs because of limitation of battery power, communications, computation and storage space. In the recent years, computational intelligence approaches such as evolutionar...
متن کاملA Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem
Community detection is a challenging optimization problem that consists of searching for communities that belong to a network under the assumption that the nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although there are many algorithms developed for community detection, most of them are unsuitable when ...
متن کاملBounded rationality via recursion
Current trends for model construction in the field of Agentbased Computational Economics base behavior of agents on either game theoretic procedures (e.g. belief learning, fictitious play, Bayesian learning) or are inspired by theArtificial Intelligence (e.g. reinforcement learning). Evidence from experiments with human subjects puts the first approach in doubt, whereas the second one imposes s...
متن کاملDynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic Environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
متن کاملReinforcement Learning in Neural Networks: A Survey
In recent years, researches on reinforcement learning (RL) have focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. The advantage of using neural networks enables the RL to search for optimal policies more efficiently in several real-life applicat...
متن کامل